Add generic gRPC stream forwarding#306
Conversation
|
I'm working on the tests, but that is not ready yet. |
|
To be able to pass Bazel's build event stream (BES) to the same DNS name, without having to add an extra L7 router in front of the bb-storage frontend, add a configuration to forward specific gRPC methods to other backends. No authorization is possible on the passed through messages because Buildbarn has no knowledge about the semantics of the forwarded messages. The gRPC reflection service has also been extended to forward requests that cannot be resolved locally.
dbaf6e8 to
75d27e9
Compare
pkg/grpc/simple_stream_forwarder.go
Outdated
| group.Go(func() error { | ||
| for { | ||
| msg := &emptypb.Empty{} | ||
| if err := outgoingStream.RecvMsg(msg); err != nil { |
There was a problem hiding this comment.
As far as I know, the only way to properly cleaning up the resources associated with a stream is to call RecvMsg() until a non-nil value is returned. I don't think that this implementation guarantees that, as we return immediately if incomingStream.SendMsg() fails.
Maybe instead of using errgroup.Group we should give in and use Buildbarn's own program.Group? Then we can do something like this:
There was a problem hiding this comment.
According to https://pkg.go.dev/google.golang.org/grpc#ClientConn.NewStream, cancelling the context is enough. program.Group does give that guarantee, but as incomingStream.RecvMsg can block until the whole method returns, a go func is needed. That go func needs to be able to cancel the context, but may try to cancel after the whole method has returned. program.Group does not support that, so I stick with errgroup.
|
I've tested the docker-compose deployment, adding the following to Then I successfully ran: |
pkg/grpc/reflection_relay.go
Outdated
| "github.com/jhump/protoreflect/v2/grpcreflect" | ||
| "github.com/jhump/protoreflect/v2/protoresolve" | ||
| v1reflectiongrpc "google.golang.org/grpc/reflection/grpc_reflection_v1" | ||
| v1alphareflectiongrpc "google.golang.org/grpc/reflection/grpc_reflection_v1alpha" |
There was a problem hiding this comment.
Is it common practice to still use the v1alpha protocol? If not, what are your thoughts on omitting it?
There was a problem hiding this comment.
grpc-java added support for v1 as late as 2024. v1alpha proto file has been set as deprecated since three years. Still, grpc-go reflection.Register() uses both APIs. grpcurl added support for v1 in 2022, so I'm find with removing v1alpha support.
Three years for upgrading grpcurl should be enough. I'll remove v1alpha.
| resolver := grpcreflect.NewClientAuto(backendCtx, grpcClient).AsResolver() | ||
| reflectionBackends = append(reflectionBackends, resolver) | ||
| } | ||
| combinedRemoteResolver := protoresolve.Combine(reflectionBackends...) |
There was a problem hiding this comment.
Question: What happens if one of the backends is down? Does this mean reflection stops functioning altogether?
There was a problem hiding this comment.
Yes, for services bind that backend in the list. The combined resolver calls the reflection backends in order until something else than NotFound is returned. Is that acceptable?
|
Thank you for review. Busy today, will reply tonight. |
|
@EdSchouten are we ready to merge this? (No hurry, is the holiday season.) |
|
@EdSchouten Friendly ping. |
0e006a9 to
080d955
Compare
|
@EdSchouten Friendly ping. |
Commit 5b5db75, "Add generic gRPC stream forwarding (buildbarn#306)", introduced a flakiness in the grpc_test due to calling `errgroup.Go(func() error { return err })` when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to `errgroup.Go()` being called just when `errgroup.Wait()` was about to return, which is not allowed.
Commit 5b5db75, "Add generic gRPC stream forwarding (buildbarn#306)", introduced a flakiness in the grpc_test due to calling `errgroup.Go(func() error { return err })` when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to `errgroup.Go()` being called just when `errgroup.Wait()` was about to return, which is not allowed.
Commit 5b5db75, "Add generic gRPC stream forwarding (buildbarn#306)", introduced a flakiness in the grpc_test due to calling `errgroup.Go(func() error { return err })` when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to `errgroup.Go()` being called just when `errgroup.Wait()` was about to return, which is not allowed.
Commit 5b5db75, "Add generic gRPC stream forwarding (buildbarn#306)", introduced a flakiness in the grpc_test due to calling `errgroup.Go(func() error { return err })` when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to `errgroup.Go()` being called just when `errgroup.Wait()` was about to return, which is not allowed.
Commit 5b5db75, "Add generic gRPC stream forwarding (buildbarn#306)", introduced a flakiness in the grpc_test due to calling `errgroup.Go(func() error { return err })` when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to `errgroup.Go()` being called just when `errgroup.Wait()` was about to return, which is not allowed.
Commit 5b5db75, "Add generic gRPC stream forwarding (#306)", introduced a flakiness in the grpc_test due to calling `errgroup.Go(func() error { return err })` when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to `errgroup.Go()` being called just when `errgroup.Wait()` was about to return, which is not allowed.

To be able to pass Bazel's build event stream (BES) to the same DNS name, without having to add an extra L7 router in front of the bb-storage frontend, add a configuration to forward specific gRPC methods to other backends. No authorization is possible on the passed through messages because Buildbarn has no knowledge about the semantics of the forwarded messages.
The gRPC reflection service has also been extended to forward requests that cannot be resolved locally.